Figure 1 

The novel gene as identified through RACE analysis (894 bp) 



GGGAGTGGAGTGAGGGGTAACAAGATGGCGACCGAGACGGTGGAGCTCCATAAGCTAA 

AGCTTGCCGAACTAAAGCAAGAATGTCTTGCTCGTGGTTTGGAGACCAAGGGAATAAAG 

CAAGATCTTATCCACAGACTCCAGGCATATCTTGAAGAACATGCTGAAGAGGAGGCAAAT 

GAAGAAGATGTACTGGGAGATGAAACAGAGGAAGAAGAAACAAAGCCCATTGAGCTCCC 

TGTCAAAGAGGAAGAACCCCCTGAAAAAACTGTTGATGTGGCAGCAGAGAAGAAAGTGG 

TGAAAATTACATCTGAAATACCAGAGACTGAGAGAATGCAGAAGAGGGCTGAACGATTCA 

ATGTACCTGTGAGCTTGGAGAGTAAGAAAGCTGCTCGGGCAGCTAGGTTTGGGATTTCT 

TCAGTTCCAACAAAAGGTCTGTCATCTGATAACAAACCTATGGTTAACTTGGATAAGCTG 

AAGGAAAGAGCTCAAAGATTTGGTTTGAATGTCTCTTCAATCTCCAGAAAGTCTGAAGAT 

GATGAGAAACTGAAAAAGAGGAAGGAGCGATTTGGGATTGTCACAAGTTCAGCTGGAAC 

TGGAACCACAGAGGATACAGAGGCAAAGAAGAGGAAAAGAGCAGAGCGCTTTGGGATT 

GCCTGATGAAAAGTTCCTGATACTTTCTGTTCTCCAGTGTTTTCCATTTCTCTCCTTCTTC 

TTGGTCACATATATGCCTAAATGCACAGTCATGTGCCTACGTCCTGCCTCGCAATGAGG 

GAGCATGTACCCCAGGTACATCCATGAACTGCGGCAGCAGTTTGACTTATTGCTGTTTCA 

GCTTTAAGGTTGTTGTGTTTTTGTTTTTGATTATGTTGCTTGTTAATAAAAAAAAATAGAAA 

A 



Figure 2 

Amino acid sequence as translated from the novel gene (210 amino acids) 

MATETVELHKLKLAELKQECLARGLETKGIKQDLIHRLQAYLEEHAEEEANEEDVLGDETEEE 
ETKPIELPVKEEEPPEKTVDVAAEKKVVKITSEIPQTERMQKRAE RFNVPVSLESK KAARAAR 
FGISSVPTK GLSSDNKPMVNLDKLKERAQ RFGLNVSSISRK SEDDEKLKKRKER FGIVTSSAG 
TGTTEDTEAK KRKRAERFGIA 

Underlined sequences are amino acid sequences obtained by MS/MS analysis. 
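
The relationship between Figure 1 and Figure 2 can be illustrated with a minimal Python sketch. It assumes the standard genetic code and assumes that translation starts at the first ATG of the cDNA; the function name `translate_from_first_atg` and the constant `FIG1_START` are illustrative only and are not part of the original disclosure. Only the first two sequence lines of Figure 1 are used, so the output is the N-terminal portion of the protein rather than all 210 amino acids.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG until a stop codon or the end of the input."""
    dna = "".join(dna.split()).upper()  # drop whitespace and line breaks
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the reading frame
            break
        protein.append(amino_acid)
    return "".join(protein)


# First two sequence lines of Figure 1 (assumption: copied without OCR errors).
FIG1_START = (
    "GGGAGTGGAGTGAGGGGTAACAAGATGGCGACCGAGACGGTGGAGCTCCATAAGCTAA"
    "AGCTTGCCGAACTAAAGCAAGAATGTCTTGCTCGTGGTTTGGAGACCAAGGGAATAAAG"
)

print(translate_from_first_atg(FIG1_START))
```

The printed peptide should correspond to the first 31 residues listed in Figure 2. Applying the same function to the full Figure 1 sequence would be one way to cross-check the remaining residues against the cDNA.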



Figure 3 



The sequence of the novel gene amplified by long-distance PCR and used to 
construct the expression vector (873 bp). 



TGGAGTGAGGGGTAACAAGATGGCGACCGAGACGGTGGAGCTCCATAAGCTAAAGCTT 

GCCGAACTAAAGCAAGAATGTCTTGCTCGTGGTTTGGAGACCAAGGGAATAAAGCAAGA 

TCTTATCCACAGACTCCAGGCATATCTTGAAGAACATGCTGAAGAGGAGGCAAATGAAG 

AAGATGTACTGGGAGATGAAACAGAGGAAGAAGAAACAAAGCCCATTGAGCTCCCTGTC 

AAAGAGGAAGAACCCCCTGAAAAAACTGTTGATGTGGCAGCAGAGAAGAAAGTGGTGAA 

AATTACATCTGAAATACCACAGACTGAGAGAATGCAGAAGAGGGCTGAACGATTCAATGT 

ACCTGTGAGCTTGGAGAGTAAGAAAGCTGCTCGGGCAGCTAGGTTTGGGATTTCTTCAG 

TTCCAACAAAAGGTCTGTCATCTGATAACAAACCTATGGTTAACTTGGATAAGCTGAAGG 

AAAGAGCTCAAAGATTTGGTTTGAATGTCTCTTCAATCTCCAGAAAGTCTGAAGATGATG 

AGAAACTGAAAAAGAGGAAGGAGCGATTTGGGATTGTCACAAGTTCAGCTGGAACTGGA 

ACCACAGAGGATACAGAGGCAAAGAAGAGGAAAAGAGCAGAGCGCTTTGGGATTGCCT 

GATGAAAAGTTCCTGATACTTTCTGTTCTCCAGTGTTTTCCATTTCTCTCCTTGTTCTTGG 

TCACATATATGCCTAAATGCACAGTCATGTGCCTACGTCCTGCCTCGCAATGAGGGAGC 

ATGTACCCCAGGTACATCCATGAACTGCGGCAGCAGTTTGACTTATTGCTGTTTCAGCTT 

TAAGGTTGTTGTGTTTTTGTTTTTGATTATGTTGCTTGTTAAT 



Figure 4 




[Figure image not reproduced; panel labels: 22 cycles of PCR, 26 cycles of PCR] 



Figure 5 
P151 5'-Untranslated Region 

1 75 
CAGGGGCAGCAGTGATTATCTGAACTCGGATCTTTAAAATTGTGGTAGCTCTAAAGCTGATGATGTCTGGTTAGG 

76 150 
AAGTGGCTCTTGCCCGCCCCAGCCCCACCGCCAGTTCCTTAAGCCCGCCCCATGCCCCTCCCAGCTTCCTCCTCA 

151 225 
TGTTCATCGGTTTTTTCAGGGCTCCCTTCAACGCTCCCCTCTCAGTATTTAGGTCACCACTCCCTCGGCGCCCCT 

226 300 
TTCGCCTCCCACCATTTTTCCTCAGCAACCCTTACAGTCTTTGCAGCTCCTACCTGCCAGCTCAGATCCCCGTCC 

301 375 
GGCT ATGGGCGCGGCGCCGGCTACCACACCTGAAGTCTCCAGGAAGTAA CGCCTCTCCTTCTGCCCCTTTCCTGT 

376 450 
TGGAGGAACAGAATCAGCGCTGCCACCACCCATTGGTTGGTGGTCTGT AATGCAGAAGCACAGTTGGTTGCCATT 

451 525 
TCTGTCGTTCGCAAGATACAGTGCCCGCCCCTCTCCCAGTTCCACCTTTTGA AAGAGGTGGGGCAAGCTGCCTAG 

526 600 
AGAAGTGAGAGCGACGTCAGCTATTGACCA ATGGGAAGAGCTGATGGTATGGCGTGGGAGCAAGAGTGA CAACGA 

601 675 
TTGGTCAGCCTTGCATCTCTACGCCTAAGGCGGGAACTCCTGGAGGCGGAGGCCGCGGGTGGGGGGAGTGGAGTG 

676 

AGGGGTAACAAGATG P151 coding region 



(Total length: 690 bp) 



Sequence with asterisk: the 274 bp fragment 

Underlined sequences are the minicistrons (upstream open reading frames, uORFs) that 
precede the start of the P151 coding region; their start and stop codons are in bold. 
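
How a minicistron of this kind can be located is sketched below in Python. The sketch assumes that a uORF is simply an ATG followed by the first in-frame stop codon (TAA, TAG or TGA) within the untranslated region; the function name `find_uorfs` and the constant `FRAGMENT` are illustrative and do not appear in the original disclosure. The test fragment is taken from the start of the 301-375 line of Figure 5.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr: str) -> list[tuple[int, int]]:
    """Return (start, end) offsets of every ATG that reaches an in-frame stop.

    Offsets are 0-based and end-exclusive; the end includes the stop codon.
    """
    utr = "".join(utr.split()).upper()  # drop spaces used as layout markers
    hits = []
    pos = utr.find("ATG")
    while pos != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(pos + 3, len(utr) - 2, 3):
            if utr[i:i + 3] in STOP_CODONS:
                hits.append((pos, i + 3))
                break
        pos = utr.find("ATG", pos + 1)
    return hits


# Start of the 301-375 line of Figure 5 (assumption: copied without OCR errors).
FRAGMENT = "GGCT ATGGGCGCGGCGCCGGCTACCACACCTGAAGTCTCCAGGAAGTAA CGCC"

for start, end in find_uorfs(FRAGMENT):
    print(start, end, "".join(FRAGMENT.split())[start:end])
```

Because the fragment begins at position 301 of the figure, the offsets 4 and 49 returned here would correspond to bases 305 through 349 in the numbering of Figure 5. An ATG with no in-frame stop inside the supplied region is deliberately not reported, which is a design choice of this sketch rather than a statement about how the uORFs in the figure were identified.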



